Weight and Bias
Functional Equivalence and Path Connectivity of Reducible Hyperbolic Tangent Networks
Understanding the learning process of artificial neural networks requires clarifying the structure of the parameter space within which learning takes place. A neural network parameter's functional equivalence class is the set of parameters implementing the same input-output function. For many architectures, almost all parameters have a simple and well-documented functional equivalence class. However, there is also a vanishing minority of reducible parameters, with richer functional equivalence classes caused by redundancies among the network's units. In this paper, we give an algorithmic characterisation of unit redundancies and reducible functional equivalence classes for a single-hidden-layer hyperbolic tangent architecture. We show that such functional equivalence classes are piecewise-linear path-connected sets, and that for parameters with a majority of redundant units, the sets have a diameter of at most 7 linear segments.
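The simple equivalence class of generic parameters and the extra freedom created by redundant units can both be checked numerically. Below is a minimal sketch (not the paper's construction) using the standard parametrisation f(x) = v·tanh(Wx + b) + c of a single-hidden-layer tanh network: it verifies the well-documented sign-flip and permutation symmetries, and then shows a reducible parameter in which a duplicated unit's outgoing weight can be split arbitrarily without changing the function.

```python
import numpy as np

def tanh_net(x, W, b, v, c):
    """One-hidden-layer tanh network: f(x) = v @ tanh(W x + b) + c."""
    return v @ np.tanh(W @ x + b) + c

rng = np.random.default_rng(0)
h, d = 4, 3
W, b = rng.normal(size=(h, d)), rng.normal(size=h)
v, c = rng.normal(size=h), rng.normal()
x = rng.normal(size=d)

# Sign-flip symmetry: tanh is odd, so negating a unit's incoming
# weights and bias together with its outgoing weight preserves f.
W2, b2, v2 = W.copy(), b.copy(), v.copy()
W2[0], b2[0], v2[0] = -W2[0], -b2[0], -v2[0]

# Permutation symmetry: reordering hidden units preserves f.
perm = rng.permutation(h)
W3, b3, v3 = W[perm], b[perm], v[perm]

print(np.allclose(tanh_net(x, W, b, v, c), tanh_net(x, W2, b2, v2, c)))  # True
print(np.allclose(tanh_net(x, W, b, v, c), tanh_net(x, W3, b3, v3, c)))  # True

# A reducible parameter: a unit whose incoming weights duplicate another
# unit's is redundant, and the shared outgoing weight can be split in a
# continuum of ways -- a richer equivalence class than sign flips and
# permutations alone.
W4 = np.vstack([W, W[:1]])           # duplicate unit 0's incoming weights
b4 = np.append(b, b[0])
t = 0.3                              # any split of the outgoing weight works
v4 = np.append(v, 0.0)
v4[0], v4[-1] = t * v[0], (1 - t) * v[0]
print(np.allclose(tanh_net(x, W, b, v, c), tanh_net(x, W4, b4, v4, c)))  # True
```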
Model Zoos: A Dataset of Diverse Populations of Neural Network Models
In recent years, neural networks (NN) have moved from laboratory environments to become the state of the art for many real-world problems. It has been shown that NN models (i.e., their weights and biases) evolve along unique trajectories in weight space during training. It follows that a population of such neural network models (referred to as a model zoo) forms structures in weight space. We hypothesise that the geometry, curvature, and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such model zoos, one could investigate novel approaches for (i) model analysis, (ii) discovering unknown learning dynamics, (iii) learning rich representations of such populations, or (iv) exploiting model zoos for generative modelling of NN weights and biases.
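As a hedged illustration of the basic data structure (the architecture, seeds, and toy task below are assumptions for the sketch, not the zoo's actual generation protocol), a model zoo can be materialised as a matrix whose rows are flattened weight-and-bias vectors, on which geometric quantities such as pairwise distances can then be computed:

```python
import torch

def flatten_params(model):
    """Stack all weights and biases of a model into one flat vector."""
    return torch.cat([p.detach().flatten() for p in model.parameters()])

# Build a tiny "zoo": the same architecture trained from different seeds
# on the same toy task, keeping one flattened weight vector per model.
X = torch.linspace(-1, 1, 64).unsqueeze(1)
y = torch.sin(3 * X)
zoo = []
for seed in range(8):
    torch.manual_seed(seed)
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(200):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()
    zoo.append(flatten_params(model))

Z = torch.stack(zoo)          # shape: (n_models, n_parameters)
print(Z.shape)
# Pairwise distances between models in weight space: raw material for
# studying the geometry of the population.
print(torch.cdist(Z, Z))
```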
Conditional updates of neural network weights for increased out of training performance
Saynisch-Wagner, Jan, Sari, Saran Rajendran
In physics, and especially in the geosciences and climate sciences, the poor performance of neural networks (NN) when applied outside their training distribution or their trained dynamics severely limits their general applicability (Irrgang et al., 2021; Landsberg and Barnes, 2025). In these fields, physical relations such as laws, dependencies, or sensitivities are commonly derived (or learned) under well-observed conditions and are then applied to less observed conditions to gain knowledge about the latter. For example, results from laboratory or numerical model experiments are regularly applied to real-world problems or observations (e.g., Mehta et al., 2025); knowledge about our Earth and our Solar System is transferred to other planets and other star systems (e.g., Kvorka et al., 2026); and relations learned from present-day data are transferred to the distant past or to the future (e.g., Eyring et al., 2016; Wang et al., 2024; Koutsodendris et al., 2014).
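This failure mode is easy to reproduce in miniature. The following sketch (a generic illustration, not the paper's method or setup) fits a small network to sin(x) on [-1, 1] and then evaluates it on [2, 3], where the error is typically orders of magnitude larger:

```python
import torch

# Fit sin(x) on x in [-1, 1], then evaluate inside and outside that range.
torch.manual_seed(0)
X_in = torch.linspace(-1, 1, 256).unsqueeze(1)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(1000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(X_in), torch.sin(X_in))
    loss.backward()
    opt.step()

with torch.no_grad():
    X_out = torch.linspace(2, 3, 256).unsqueeze(1)  # outside training range
    err_in = torch.nn.functional.mse_loss(model(X_in), torch.sin(X_in))
    err_out = torch.nn.functional.mse_loss(model(X_out), torch.sin(X_out))

print(f"in-distribution MSE:     {err_in.item():.2e}")
print(f"out-of-distribution MSE: {err_out.item():.2e}")  # much larger
```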
Neuronal Fluctuations: Learning Rates vs Participating Neurons
Pareek, Darsh, Kumar, Umesh, Rao, Ruthu, Janjam, Ravi
Deep Neural Networks (DNNs) rely on inherent fluctuations in their internal parameters (weights and biases) to effectively navigate the complex optimization landscape and achieve robust performance. While these fluctuations are recognized as crucial for escaping local minima and improving generalization, their precise relationship with fundamental hyperparameters remains underexplored. In particular, there is a significant knowledge gap concerning how the learning rate, a critical parameter governing the training process, directly influences the dynamics of these fluctuations. This study systematically investigates the impact of varying learning rates on the magnitude and character of weight and bias fluctuations within a neural network. We trained a model with several distinct learning rates and analyzed the corresponding parameter fluctuations in conjunction with the network's final accuracy. Our analysis aims to establish a clear link between the learning rate's value, the resulting fluctuation patterns, and overall model performance. In doing so, we provide deeper insights into the optimization process, shedding light on how the learning rate mediates the crucial exploration-exploitation trade-off during training. This work contributes to a more nuanced understanding of hyperparameter tuning and the underlying mechanics of deep learning.
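A minimal version of such a measurement can be sketched as follows; the architecture, toy task, and learning rates are illustrative assumptions rather than the study's actual setup, and the per-step parameter-update norm is used as a simple proxy for fluctuation magnitude (the study's own metric may differ):

```python
import torch

def train_and_measure(lr, steps=500, seed=0):
    """Train a small MLP and record the per-step weight-update magnitude."""
    torch.manual_seed(seed)
    X = torch.randn(256, 10)
    y = torch.randn(256, 1)
    model = torch.nn.Sequential(
        torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    fluctuations = []
    for _ in range(steps):
        before = torch.cat(
            [p.detach().flatten().clone() for p in model.parameters()])
        opt.zero_grad()
        torch.nn.functional.mse_loss(model(X), y).backward()
        opt.step()
        after = torch.cat([p.detach().flatten() for p in model.parameters()])
        fluctuations.append((after - before).norm().item())
    f = torch.tensor(fluctuations)
    return f.mean().item(), f.std().item()

# Larger learning rates typically produce larger, more variable updates.
for lr in (1e-3, 1e-2, 1e-1):
    mean, std = train_and_measure(lr)
    print(f"lr={lr:g}: mean step size {mean:.3e}, std {std:.3e}")
```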